

Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder

Chen, Feng, Xu, Weizhe, Li, Changye, Pakhomov, Serguei, Cohen, Alex, Bhola, Simran, Yin, Sandy, Tang, Sunny X, Mackinley, Michael, Palaniyappan, Lena, Ben-Zeev, Dror, Cohen, Trevor

arXiv.org Artificial Intelligence

Formal thought disorder (FTD), a hallmark of schizophrenia spectrum disorders, manifests as incoherent speech and poses challenges for clinical assessment. Traditional clinical rating scales, though validated, are resource-intensive and lack scalability. Automated speech analysis with automatic speech recognition (ASR) allows for objective quantification of linguistic and temporal features of speech, offering scalable alternatives. The use of utterance timestamps in ASR captures pause dynamics, which are thought to reflect the cognitive processes underlying speech production. However, the utility of integrating these ASR-derived features for assessing FTD severity requires further evaluation. This study integrates pause features with semantic coherence metrics across three datasets: naturalistic self-recorded diaries (AVH, n = 140), structured picture descriptions (TOPSY, n = 72), and dream narratives (PsyCL, n = 43). We evaluated pause-related features alongside established coherence measures, using support vector regression (SVR) to predict clinical FTD scores. Key findings demonstrate that pause features alone robustly predict the severity of FTD. Integrating pause features with semantic coherence metrics enhanced predictive performance compared to semantic-only models, with integration of independent models achieving correlations up to ρ = 0.649 and AUC = 83.71% for detecting severe cases (TOPSY, versus best ρ = 0.584 and AUC = 79.23% for semantic-only models). The performance gains from integrating semantic and pause features held consistently across all contexts, though the nature of pause patterns was dataset-dependent. These findings suggest that frameworks combining temporal and semantic analyses provide a roadmap for refining the assessment of disorganized speech and advance automated speech analysis in psychosis.
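A minimal sketch of the kind of pause features the abstract describes, derived from ASR word-level timestamps. The 0.25 s pause threshold and the feature names are illustrative assumptions, not the paper's definitions.

```python
# Hypothetical sketch: deriving simple pause features from ASR word-level
# timestamps (start, end in seconds). The 0.25 s threshold and feature
# names are illustrative, not the paper's definitions.
from statistics import mean

# (start, end) timestamps for each recognized word
words = [(0.0, 0.4), (0.5, 0.9), (2.1, 2.6), (2.7, 3.0), (4.5, 5.0)]

# A "pause" here is any inter-word gap of at least 0.25 s.
gaps = [s2 - e1 for (_, e1), (s2, _) in zip(words, words[1:])]
pauses = [g for g in gaps if g >= 0.25]

features = {
    "pause_count": len(pauses),
    "mean_pause": round(mean(pauses), 3),
    # fraction of total speaking time spent pausing
    "pause_time_ratio": round(sum(pauses) / (words[-1][1] - words[0][0]), 3),
}
print(features)
```

Features like these could then be concatenated with semantic coherence scores as input to an SVR, as the study does.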


WavePulse: Real-time Content Analytics of Radio Livestreams

Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay

arXiv.org Artificial Intelligence

Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at \url{https://wave-pulse.io}.


NLP Case Study on Predicting the Before and After of the Ukraine-Russia and Hamas-Israel Conflicts

Miner, Jordan, Ortega, John E.

arXiv.org Artificial Intelligence

We propose a method to predict toxicity and other textual attributes through the use of natural language processing (NLP) techniques for two recent events: the Ukraine-Russia and Hamas-Israel conflicts. This article provides a basis for exploration in future conflicts with hopes to mitigate risk through the analysis of social media before and after a conflict begins. Our work compiles several datasets from Twitter and Reddit for both conflicts in a before-and-after separation with an aim of predicting a future state of social media for avoidance. More specifically, we show that: (1) there is a noticeable difference in social media discussion leading up to and following a conflict and (2) social media discourse on platforms like Twitter and Reddit is useful in identifying future conflicts before they arise. Our results show that through the use of advanced NLP techniques (both supervised and unsupervised), toxicity and other attributes of language before and after a conflict are predictable with a low error of about 1.2 percent for both conflicts.


A Simple Illustration of Interleaved Learning using Kalman Filter for Linear Least Squares

John, Majnu, Wu, Yihren

arXiv.org Machine Learning

Interleaved learning (IL) is one of the mechanisms proposed by Complementary Learning Systems theory (McClelland, McNaughton and O'Reilly, 1995; Marr, 1971) to explain how successful learners such as human beings mitigate the effects of 'catastrophic interference' while learning. Recent illustrations of IL using neural networks include Saxena, Shobe and McNaughton, 2022, who showed that if the new information is similar to a subset of old items, then deep neural networks can learn the new information rapidly and with the same level of accuracy by interleaving the old items in the subset. A similar insight was presented in McClelland, McNaughton and Lampinen, 2020, where it was shown that for artificial neural networks, information consistent with prior knowledge can sometimes be integrated very quickly. Another recent paper (Ban and Xie, 2021) formulated interleaved machine learning as a multi-level optimization problem, and developed an efficient differentiable algorithm to solve the interleaving learning problem with application to neural architecture search. A closely related biological concept is interleaved replay, which also has been empirically validated in the literature (Gepperth and Karaoguz, 2016; Kemker and Kanan, 2018). Over the past couple of decades, ideas inspired by biological IL have been utilized in a wide array of online learning methods as well, especially to prevent catastrophic forgetting. See, for example, Wang et.
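A toy sketch of the Kalman-filter view of linear least squares that the title refers to: scalar recursive least squares, with old and new observations fed in alternation. The data and the simple alternating schedule are illustrative assumptions; the paper's construction is richer.

```python
# Hedged sketch: scalar recursive least squares (the Kalman-filter view of
# linear least squares). "Interleaving" here just means alternating old and
# new observations instead of training on the new data alone.
def rls_update(theta, P, x, y):
    K = P * x / (1.0 + x * x * P)      # Kalman gain
    theta = theta + K * (y - x * theta)  # correct estimate by the residual
    P = (1.0 - K * x) * P              # shrink posterior variance
    return theta, P

old_items = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # roughly y = 2x
new_items = [(1.5, 3.1), (2.5, 4.9)]

theta, P = 0.0, 1e6  # diffuse prior: trust the data, not the initial guess
# interleave: alternate new and old items, then append any leftover old items
stream = [obs for pair in zip(new_items, old_items) for obs in pair]
stream += old_items[len(new_items):]
for x, y in stream:
    theta, P = rls_update(theta, P, x, y)
print(round(theta, 2))  # 1.99
```

With a diffuse prior, the recursive estimate matches the batch least-squares solution over the whole stream, regardless of presentation order; the interesting effects the paper studies arise when the prior is informative.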


AdaER: An Adaptive Experience Replay Approach for Continual Lifelong Learning

Li, Xingyu, Tang, Bo, Li, Haifeng

arXiv.org Artificial Intelligence

Continual lifelong learning is a machine learning framework inspired by human learning, where learners are trained to continuously acquire new knowledge in a sequential manner. However, the non-stationary nature of streaming training data poses a significant challenge known as catastrophic forgetting, which refers to the rapid forgetting of previously learned knowledge when new tasks are introduced. While some approaches, such as experience replay (ER), have been proposed to mitigate this issue, their performance remains limited, particularly in the class-incremental scenario, which is considered natural and highly challenging. In this paper, we present a novel algorithm, called adaptive-experience replay (AdaER), to address the challenge of continual lifelong learning. AdaER consists of two stages: memory replay and memory update. In the memory replay stage, AdaER introduces a contextually-cued memory recall (C-CMR) strategy, which selectively replays memories that are most conflicting with the current input data in terms of both data and task. Additionally, AdaER incorporates an entropy-balanced reservoir sampling (E-BRS) strategy to enhance the performance of the memory buffer by maximizing information entropy. To evaluate the effectiveness of AdaER, we conduct experiments on established supervised continual lifelong learning benchmarks, specifically focusing on class-incremental learning scenarios. The results demonstrate that AdaER outperforms existing continual lifelong learning baselines, highlighting its efficacy in mitigating catastrophic forgetting and improving learning performance.
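For context, a sketch of standard reservoir sampling, the buffer-update mechanism that E-BRS modifies. The entropy-balancing criterion itself is not specified in the abstract, so this shows only the uniform baseline.

```python
# Plain reservoir sampling (Algorithm R), the baseline that AdaER's
# entropy-balanced variant builds on. The entropy criterion itself is not
# described in the abstract, so it is not sketched here.
import random

def reservoir_update(buffer, capacity, item, seen):
    """Maintain a uniform random sample of the stream in a fixed-size buffer.

    `seen` is the number of items observed so far, including `item`.
    """
    if len(buffer) < capacity:
        buffer.append(item)
    else:
        j = random.randrange(seen)  # item survives with probability capacity/seen
        if j < capacity:
            buffer[j] = item

random.seed(0)
buffer, capacity = [], 5
for i, item in enumerate(range(100), start=1):
    reservoir_update(buffer, capacity, item, seen=i)
print(len(buffer))  # 5: a uniform sample of the 100 streamed items
```

Each streamed item ends up in the buffer with equal probability, which is exactly the property an entropy- or class-balance-aware variant would deliberately bias.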


Producing Competent HPC Graduates

Communications of the ACM

Computing competency is becoming an essential quality needed by industry. For decades, the gap between baccalaureate computing graduates and industry needs was a discussion topic. Most graduates seek employment rather than continuing into full-time graduate (master's or doctoral) programs. While the percentage varies by institution, it is estimated that about 5% of computing graduates choose full-time graduate study upon graduation, meaning that 95% of computing graduates seek jobs in business, government, or industry. While computing graduates may acquire jobs in today's world, they often lack the competencies (skills and dispositions) expected in the workplace. Most undergraduate computing-degree programs want to produce job-ready graduates who are productive on the first workday. They often seek local advisory boards composed of industry, government, and business representatives to help develop a functional computing curriculum for their students. Information technology and computing disciplines are changing, and new fields appear continuously. Computing curricula and undergraduate programs are challenged to keep up with this rapid change. Employers are looking for competent graduates who can apply the knowledge, skill, and culture they acquire in college to solve problems as soon as they enter the workforce. High-performance computing (HPC) and parallel and distributed computing (PDC) have become pervasive.


A novel nonconvex, smooth-at-origin penalty for statistical learning

John, Majnu, Vettam, Sujit, Wu, Yihren

arXiv.org Machine Learning

Nonconvex penalties are utilized for regularization in high-dimensional statistical learning algorithms primarily because they yield unbiased or nearly unbiased estimators for the parameters in the model. Nonconvex penalties existing in the literature, such as SCAD, MCP, Laplace, and arctan, have a singularity at the origin, which also makes them useful for variable selection. However, in several high-dimensional frameworks such as deep learning, variable selection is less of a concern. In this paper, we present a nonconvex penalty which is smooth at the origin. The paper includes asymptotic results for ordinary least squares estimators regularized with the new penalty function, showing asymptotic bias that vanishes exponentially fast. We also conducted an empirical study employing a deep neural network architecture on three datasets and a convolutional neural network on four datasets. The empirical study showed better performance for the new regularization approach in five out of the seven datasets.
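An illustrative contrast between the two penalty shapes the abstract distinguishes. MCP is one of the singular-at-origin penalties it names; the rational penalty t²/(1+t²) stands in for a smooth-at-origin nonconvex penalty and is an assumption of this sketch, not the paper's proposal.

```python
# Illustrative contrast (not the paper's penalty): MCP has a kink at the
# origin, while t^2 / (1 + t^2) is smooth there yet still nonconvex and
# bounded, the qualitative shape the paper argues for.
def mcp(t, lam=1.0, gamma=3.0):
    """Minimax concave penalty: linear near 0, flat beyond gamma*lam."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - t * t / (2 * gamma)
    return gamma * lam * lam / 2

def smooth_nonconvex(t):
    """A smooth-at-origin, bounded, nonconvex penalty (illustrative)."""
    return t * t / (1.0 + t * t)

# Near zero, MCP grows linearly (|slope| -> lam) while the smooth penalty
# grows quadratically (slope -> 0): the origin singularity is what forces
# small coefficients exactly to zero, i.e. drives variable selection.
eps = 1e-4
print(round(mcp(eps) / eps, 3))               # ≈ 1.0
print(round(smooth_nonconvex(eps) / eps, 6))  # ≈ 0.0001
```

When variable selection is not the goal, as the abstract argues for deep learning, the quadratic behavior at the origin avoids the selection-inducing kink while keeping the bias-reducing flatness for large coefficients.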


Regularized deep learning with a non-convex penalty

Vettam, Sujit, John, Majnu

arXiv.org Machine Learning

Regularization methods are often employed in deep learning neural networks (DNNs) to prevent overfitting. For penalty-based methods for DNN regularization, typically only convex penalties are considered because of their optimization guarantees. Recent theoretical work has shown that non-convex penalties satisfying certain regularity conditions are also guaranteed to perform well with standard optimization algorithms. In this paper, we examine new and currently existing non-convex penalties for DNN regularization. We provide theoretical justifications for the new penalties and also assess the performance of all penalties on DNN analysis of real datasets. The success of DNNs in learning complex relationships between the inputs and outputs may be attributed mainly to multiple nonlinear hidden layers [1,2]. Such a large number of parameters gives the method an incredible amount of flexibility. However, on the downside, this may lead to overfitting the data, especially if the training sample is not large enough.


First Clinton-Trump matchup breaks presidential debate record with about 84 million TV viewers

Los Angeles Times

The contentious first presidential debate between Hillary Clinton and Donald Trump lived up to its big ratings expectations with an estimated average TV viewership that will top the previous record of 80.6 million. The total average audience for Monday's matchup for the ad-supported broadcast and cable networks as well as PBS came in at about 84 million, according to Nielsen numbers. Monday's faceoff tops the previous record for a presidential debate set when Jimmy Carter and Ronald Reagan clashed on Oct. 28, 1980. It was their only meeting of that year's presidential campaign, which occurred in an era when U.S. households had only a handful of channels to choose from. The total across broadcast and cable networks measured by Nielsen for the Clinton-Trump debate does not include viewers who watched the debate through various video streams available online.


Flipboard on Flipboard

#artificialintelligence

The first presidential debate between Hillary Clinton and Donald Trump is in the books. I tweeted, took notes and picked some winners and losers. At times she came across as overly rehearsed and robotic. This week, the advertising world converges on New York City to discuss the industry and its ongoing changes. The team at Advertising Week has created a program full of interesting speakers and topics, including brand storytelling, mobile advertising and diversity.